Ross is a Postdoctoral Research Fellow in the School of Biological Sciences at the University of Queensland. His research falls within the field of animal ecology, where he uses animal occurrence, abundance and movement information to test hypotheses and make predictions regarding individual- and population-level responses to environmental change. He’s used R for over 12 years, has published two R packages on CRAN, was lead R developer on the ZoaTrack.org project and regularly teaches R training courses at The University of Queensland and at other institutes around Australia. For more information and teaching resources, check out Ross’s website.
Vinay is a Postdoctoral Research Fellow at the Australian Institute of Marine Science. He is an ecologist who is particularly interested in using large spatio-temporal datasets to understand animal movement and distribution patterns. He has considerable experience using R to analyse and visualise large and complex spatial datasets, and has developed R code to analyse 2- and 3-dimensional movement patterns of animals using acoustic telemetry data, from single study sites to continental-scale arrays. Vinay’s R code can be found on his GitHub page.
In this course you will learn how to analyse and interpret your aquatic telemetry datasets in R. This workshop will hopefully show you how R can make the processing of spatial data much quicker and easier than using standard GIS software, and can help you plot some deadly figures! At the end of this workshop you will also have the annotated R code that you can re-run at any time, share with collaborators and build on with those newly acquired data!
We designed this course not to comprehensively cover all the tools in R, but rather to teach you skills for taking your experience with R to the next level. Every new project comes with its own problems and questions and you will need to be independent, patient and creative to solve these challenges. It makes sense to invest time in becoming familiar with R, because today R is the leading platform for environmental data analysis and has some other functionalities which may surprise you!
This R workshop will be divided into 4 sessions intended to run about 1 hr 15 mins each.
The main principles we hope you will learn today are…
Turning raw telemetry data into publishable results is a highly involved process. Tracking data sets are becoming larger and larger as they are gathered over longer time periods, over larger spatial extents and at increasing temporal resolutions. While this increases our ability to detect subtle patterns, these data sets are becoming vast and require analytical tools that can easily handle, manipulate and visualise complex data.
Processing and analysing telemetry datasets can require a huge investment of time: rearranging data, removing erroneous values, purchasing, downloading and learning new software, and running analyses. Furthermore, merging Excel spreadsheets, filtering data and preparing data for statistical analyses and plotting across different software packages can introduce all sorts of errors.
There is a better way…
R is a powerful language for data wrangling and analysis because…
As R is open source, the more people we can get helping out on the R spatial mailing lists (e.g. R-sig-geo) and contributing their own spatial packages to the wider community, the more powerful R becomes!
For this course, we assume you have a basic understanding of the R environment, and working with RStudio. If you do not have experience working in R we encourage you to go through this online course that will bring you up to speed!
To access data and scripts we will work through in this course, download the OCS-2018-Rworkshop folder from GitHub. This folder contains the course documents, telemetry data and R scripts we are going to work with. To download the folder click on the green “Clone or download” dropdown menu and select “Download ZIP”
How to access course resources
Once downloaded, the workshop folder contains the following sub folders:
First, unzip this folder and extract it to a location on your computer where you typically store your research files.
To link with these folders, we are going to use the Project functionality of R Studio.
First, open the R Studio program on your computer and in the TOP LEFT corner of the R Studio window, click File > New Project….
Next, create a New Project using the Create Project command (available on the Projects menu and on the global toolbar):
alt text
Select Existing Directory then save the project within the “OCS-2018-Rworkshop” folder.
When a New Project is created, R Studio:
There are a number of good reasons why you should work with Projects in RStudio:
1. We may want our R scripts to be saved in a place where they link seamlessly to other documents and data files for our research project.
2. We may also want our tables, figures and statistical results to be written to locations on our computer where they are easy to locate.
3. We may work on multiple workstations (desktops, laptops) and want our code to be completely transferable using cloud infrastructure (e.g. Dropbox, OneDrive).
4. Projects allow us to share work folders, data and R code with collaborators who are working on the same projects.
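A practical consequence of opening a Project is that the working directory is set to the project folder, so scripts can refer to files with relative paths. A minimal sketch (assuming the workshop folder layout described above):

```
# With the .Rproj file open, paths are relative to the project root
getwd()                                        # prints the path to the OCS-2018-Rworkshop folder
file.exists('Data/Blacktip_ClevelandBay.csv')  # TRUE if the Data folder is in place
```

This is why the `read.csv('Data/...')` calls later in this handbook work without any full file paths.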
Part of the reason R has become so popular is the vast array of packages that are freely available and highly accessible. In the last few years, the number of packages has grown exponentially, with > 10,000 on CRAN at my last check! These can help you do a galaxy of different things in R, including running complex analyses, drawing beautiful figures, running R as a GIS, constructing your own R packages, building web pages and even writing introductory R course handbooks!
Let’s suppose you want to load the sp package to access this package’s incredible spatial functionality. If this package is not already installed on your machine, you can download it from the web by using the following command in R.
install.packages("sp", repos='http://cran.us.r-project.org')

In this example, sp is the package to be downloaded and ‘http://cran.us.r-project.org’ is the repository from which the package will be accessed.
Multiple packages can be loaded at the same time by listing the required packages in a vector…
install.packages(c("rgdal","rgeos","adehabitatHR",
"raster","gdistance",
"leaflet","tidyverse",
'XML', 'gstat', 'Hmisc', 'checkmate',"devtools"), repos='http://cran.us.r-project.org')

We then load the required packages from our computer’s R package library using the library() function.
library(sp)
library(rgdal)
library(rgeos)
library(adehabitatLT)
library(adehabitatHR)
library(leaflet)
library(tidyverse)
library(devtools)
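As an aside, a common idiom (just a sketch, not part of the original course material) is to install a package only if it is missing before loading it, which avoids re-downloading packages every time a script is run:

```
# Install sp only if it isn't already in the package library, then load it
if (!requireNamespace("sp", quietly = TRUE)) {
  install.packages("sp", repos = 'http://cran.us.r-project.org')
}
library(sp)
```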
In the first session we are going to work with a data set containing detection data from 3 Australian Blacktip Sharks (Carcharhinus tilstoni) shown in the image above. These animals were captured and tagged within Cleveland Bay, Townsville, roughly one month prior to the landfall of Cyclone Yasi in 2011. Blacktip sharks were tracked using a network of acoustic hydrophones deployed in a grid pattern on the east and west sides of Cleveland Bay.
Telemetry data from these sharks were analysed alongside 45 others from five species to examine movement patterns of coastal sharks before, during and after three extreme weather events (i.e. cyclones, hurricanes and tropical storms) in Australia and the US. You can read more about that study here.
The web map of detection data we will explore by the end of Session 1
Before we can analyse these data, we first need to read this dataset into R. As with most acoustic detection datasets exported from VUE or other acoustic telemetry data management software, our data set is in the ‘comma separated value’ (.csv) format.
A .csv file can simply be imported into R using the read.csv base function, and by telling R which file to load (Blacktip_ClevelandBay.csv) and where to find it (i.e. in the Data folder).
# Load the blacktip shark data using base read.csv function
blacktip <- read.csv('Data/Blacktip_ClevelandBay.csv', header = TRUE)

However, loading very large data files can take a long time when using the read.csv function. A useful alternative is the fread function in the ‘data.table’ package, which is very effective at loading very large files such as those exported from the Vemco VUE database.
library(data.table)
# Load the blacktip shark data using data.table
blacktip <- fread('Data/Blacktip_ClevelandBay.csv', header = TRUE)
# You can also use fread to input data directly from a website URL
blacktip <- fread('https://raw.githubusercontent.com/RossDwyer/OCS-2018-Rworkshop/master/Data/Blacktip_ClevelandBay.csv')

A note about Excel files
Don’t use ‘.xlsx’ or ‘.xls’ files for saving data. The problem with ‘.xls’ and ‘.xlsx’ files is that they store extra information with the data, making files larger than necessary, and Excel can also unwittingly reformat or alter your data!
A stable way to save your data is as a ‘.csv’ file, which stands for ‘comma separated values’. These are simply values separated by commas, with rows defined by returns. If you select ‘Save as’ in Excel, you can choose ‘.csv’ as one of the options. If you open the .csv file provided in the Data folder using a text editor, you will see it is just words, numbers and commas.
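Conversely, any data frame in R can be written back out as a .csv file using the base write.csv function. A minimal sketch (the output file name here is just an example, not a file used elsewhere in this course):

```
# Write the blacktip data frame back out as a .csv file in the Data folder
write.csv(blacktip, file = 'Data/Blacktip_copy.csv', row.names = FALSE)
```

Setting row.names = FALSE prevents R from adding an extra column of row numbers to the file.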
The ability to quickly filter and manipulate datasets is important when working with large telemetry datasets. The standard input format for tabulated data in R is a data.frame. A data.frame is a special ‘class’ of object, with multiple variables stored in named columns and multiple rows for our detections. Every column has the same number of rows. There are simple rules to filter and select data from a data.frame. Rows and columns are indexed using a comma as a separator, df[row, column]. Try the following code:
blacktip[1,] # Provides the first row of data
blacktip[1,3] # Provides the cell value from the 1st row and 3rd column
blacktip[,3] # Provides the third column of data
blacktip[c(1,3),] # Provides the first and third rows of the data frame

We can also access objects in a data.frame by their names:
blacktip$Transmitter #provides all the data for that object
blacktip$Transmitter[1:5] #provides the first five values for that object
blacktip[,'Transmitter'] #provides all the data for that object

The data.table package enhances the functionality of the traditional data.frame format and allows you to do more than just select data based on row and column names.
Firstly, data loaded using the fread function arrives in the data.table format. This format is slightly different to the data.frame format but can still be manipulated using the same rules as above.
To understand the enhancements of the data.table format we must first understand the general form of the data.table syntax. data.tables take the form of DT[row, column, by]. The additional by variable allows for quick and easy subsetting and aggregating. Try the following code:
# Subset rows within a data.table
blacktip[`Transmitter Name` == "Ana" & Receiver == "VR2W-104912",] # Select detections from 'Ana' detected on "VR2W-104912" receiver.
# Subset rows and columns within a data.table
blacktip[`Transmitter Name` == "Ana", .(Receiver, Latitude, Longitude),] # Subsets Receiver, Latitude and Longitude data from `Ana`
# Aggregate numbers of detections from each of the three tagged animal
blacktip[ , .(.N), by = "Transmitter Name"]

The .N call is a special variable that holds the number of rows in the current group. Grouping by Transmitter Name obtains the number of rows (i.e. detections) for each shark. We can include multiple variables in the by parameter to subset our dataset further:
# Aggregate numbers of detections at each receiver for each tagged animal
blacktip[ , .(.N), by = c("Transmitter Name", "Receiver")]

We encourage you to use the package vignettes to explore additional functionality of the data.table package to make data subsetting and manipulation quicker and easier.
browseVignettes("data.table")

Now that we’ve successfully loaded our tracking dataset, let’s start having a closer look at the data using pipes (%>%).
The %>% pipe operator originally comes from the magrittr package but has been imported into the tidyverse. %>% is an infix operator, which means it takes two operands, left and right.

# For example, to see what class our data is in, we could use this code...
class(blacktip)
# Alternatively in the tidy verse we could use this code...
blacktip %>% class()

We can also insert functions such as View() or head() into a pipe chain. We can have a quick look at the data by typing:
# Now insert functions into the pipe chain
blacktip %>% View()
blacktip %>% head() # first 6 rows by default
blacktip %>% tail(10) # specify we want to look at the last 10 rows

This functionality is particularly useful if the data is very large!
Note the () around the data frame, as opposed to the [] we used for indexing. The () signify a function.
We can look at the data more closely using the nrow(), ncol(), length(), unique(), str() and summary() functions.
blacktip %>% nrow() # number of rows in the data frame
blacktip %>% ncol() # number of columns in the data frame
blacktip %>% str() # provides internal structure of an R object
blacktip %>% summary() # provides result summary of the data frame

# pipes can be used on a single column within data frames
blacktip$`Transmitter Name`<-
blacktip$`Transmitter Name` %>% as.factor()
# pipes are used to conduct multiple functions on the dataset in a certain order
blacktip %>%
subset(`Transmitter Name` == "Colin") %>% # subset dataset to include only detections by 'Colin'
nrow() # number of rows (i.e. detections) from 'Colin'

Pipes can also be used to pre-process our data before plotting them. Let’s now use pipes to plot a simple barplot of the number of Colin’s detections at each receiver.
blacktip %>%
subset(`Transmitter Name` == "Colin") %>% # subset dataset to include only detections by 'Colin'
with(table(`Station Name`)) %>% # create a table with the number of rows (i.e. detections) per receiver
barplot(las=2) # barplot of number of Colin's detections recorded per receiver

Now that you’ve jumped into the world of pipes, it’s time to fully introduce you to the tidyverse group of R packages. These have really revolutionised the way we work with big data and the way we visualise our results.
The tidyverse was formerly known as the hadleyverse. It includes the packages: broom, dplyr, forcats, ggplot2, haven, httr, hms, jsonlite, lubridate, magrittr, modelr, purrr, readr, readxl, stringr, tibble, rvest, tidyr and xml2.
lubridate

lubridate is an easy way to convert date-time data into a form R can recognise (e.g. ymd for dates; ymd_hms for date and time parsing). Currently in our blacktip dataset the “Date and Time (UTC)” column is recognised as “character” and recorded in the Universal Coordinated Time zone (UTC). We want to convert this to local time (UTC + 10 hours). Let’s use lubridate to convert this column into the ‘POSIX’ format and into local date-time.
library(lubridate)
blacktip$local.Date.time<-
blacktip$`Date and Time (UTC)` %>%
ymd_hms(tz="UTC") %>% # first convert the `Date and Time (UTC)` column into a 'POSIX' format
with_tz(tzone="Australia/Brisbane") # convert to local "Australia/Brisbane" date time (UTC + 10hrs)

dplyr

dplyr is the data wrangling workhorse of the tidyverse. Its basic vocabulary:
* select() columns from a tibble
* filter() rows matching a certain condition
* mutate() a tibble by changing or adding columns
* arrange() rows in order
* group_by() a variable
* summarise() data over a group using a function
Check out this useful Cheatsheet for data wrangling.
select

We can use the select function in dplyr to choose the columns we want to include for our analyses and plotting
# Select the columns we are interested in
blacktip <-
blacktip %>%
select(`local.Date.time`,`Latitude`,`Longitude`,`Receiver`,`Station Name`,`Transmitter Name`,`Transmitter`,`Sensor Value`) %>%
select(-`Sensor Value`)
head(blacktip)

## local.Date.time Latitude Longitude Receiver Station Name
## 1: 2011-01-24 02:41:31 -19.27705 146.9030 VR2-5052 E18
## 2: 2011-01-24 02:43:35 -19.27705 146.9030 VR2-5052 E18
## 3: 2011-01-24 02:48:25 -19.27705 146.9030 VR2-5052 E18
## 4: 2011-01-24 07:07:34 -19.27527 146.8894 VR2W-104912 E2
## 5: 2011-01-24 07:12:06 -19.27527 146.8894 VR2W-104912 E2
## 6: 2011-01-24 20:57:27 -19.28930 146.8771 VR2W-104197 E1
## Transmitter Name Transmitter
## 1: Ana A69-1303-63639
## 2: Ana A69-1303-63639
## 3: Ana A69-1303-63639
## 4: Ana A69-1303-63639
## 5: Ana A69-1303-63639
## 6: Ana A69-1303-63639
filter and arrange

Subset the data to rows matching logical conditions and then arrange according to particular attributes
filter(blacktip, `Transmitter Name` == "Ana") %>%
arrange(local.Date.time) # arrange Ana's detections in chronological order
filter(blacktip, `Transmitter Name` == "Bruce") %>%
arrange(desc(local.Date.time)) # arrange Bruce's detections in descending chronological order

summarise

Determine the total number of detections for each tagged shark
blacktip %>%
group_by(`Transmitter Name`) %>%
summarise(NumDetections = n()) # summarise number of detections per tagged shark
blacktip %>%
group_by(`Transmitter Name`, `Receiver`) %>%
summarise(NumDetections = n()) # summarise number of detections per shark at each receiver

mutate

Adding and removing columns in the data frame through a pipe
blacktip<-
blacktip %>%
mutate(date=date(local.Date.time)) %>% # adding a column to the blacktip data with date of each detection
mutate(Transmitter=NULL) # removing the `Transmitter` column
head(blacktip)

## local.Date.time Latitude Longitude Receiver Station Name
## 1 2011-01-24 02:41:31 -19.27705 146.9030 VR2-5052 E18
## 2 2011-01-24 02:43:35 -19.27705 146.9030 VR2-5052 E18
## 3 2011-01-24 02:48:25 -19.27705 146.9030 VR2-5052 E18
## 4 2011-01-24 07:07:34 -19.27527 146.8894 VR2W-104912 E2
## 5 2011-01-24 07:12:06 -19.27527 146.8894 VR2W-104912 E2
## 6 2011-01-24 20:57:27 -19.28930 146.8771 VR2W-104197 E1
## Transmitter Name date
## 1 Ana 2011-01-24
## 2 Ana 2011-01-24
## 3 Ana 2011-01-24
## 4 Ana 2011-01-24
## 5 Ana 2011-01-24
## 6 Ana 2011-01-24
ggplot2

ggplot2 is a powerful data visualisation package for R. The package makes it very easy to generate some very impressive figures and utilise a range of colour palettes, taking care of many of the fiddly details that can make plotting graphs in R a hassle.
The system provides mappings from your data to aesthetics which are used to construct beautiful plots.
Documentation for ggplot can be found here and here.
There is also this awesome Cheatsheet for ggplot2
ggplot2 grammar

The basic idea: independently specify plot building blocks and combine them to create just about any kind of graphical display you want. Building blocks of a graph include:
* data
* aesthetic mapping
* geometric object
* statistical transformations
* scales
* coordinate system
* position adjustments
* faceting
In the script below we call the data set we have just made (blacktip) and then pipe it into the ggplot function. We then tell ggplot that we want to plot a boxplot.
library(ggplot2)
blacktip %>%
group_by(`Transmitter Name`,`date`) %>%
summarise(DailyDetections= n()) %>% # use summarise to calculate numbers of detections per day per animal
ggplot(mapping = aes(x = `Transmitter Name`, y = DailyDetections)) + # define the aesthetic map (what to plot)
geom_boxplot() # define the geometric object (how to plot it)... in this case a boxplot

A common plot used in passive acoustic telemetry to assess temporal patterns in detection is the Abacus plot. This plot can help quickly assess which animals are being detected consistently within your array, and identify any temporal patterns in detection.
We can adapt the above script to create an abacus plot using our blacktip dataset
blacktip %>%
ggplot(mapping = aes(x = `local.Date.time`, y = as.factor(`Transmitter Name`))) +
xlab("Date") + ylab("Tag") +
geom_point()

We can also use the facet_wrap function to explore the detection data a bit more and look at how animals were detected at each receiver
blacktip %>%
ggplot(mapping = aes(x = `local.Date.time`, y = as.factor(`Station Name`))) +
xlab("Date") + ylab("Receiver station") +
geom_point() +
facet_wrap(~`Transmitter Name`, nrow=1) # This time plot separate panels for each shark

In ggplot, aesthetic means “something you can see”. Aesthetic mapping (i.e., with aes()) only says that a variable should be mapped to an aesthetic. It doesn’t say how that should happen. For example, when mapping a variable to shape with aes(shape = x) you don’t say what shapes should be used. Similarly, aes(color = z) doesn’t say what colors should be used. Describing what colors/shapes/sizes etc. to use is done by modifying the corresponding scale.
In ggplot2, scales include position (x, y), colour and fill, shape, size and linetype.
Each type of geom accepts only a subset of all aesthetics–refer to the geom help pages to see what mappings each geom accepts. Aesthetic mappings are set with the aes() function.
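As an illustrative sketch of modifying a scale (using the blacktip columns from earlier; the specific colours here are arbitrary choices, not from the original course material):

```
library(ggplot2)
blacktip %>%
  ggplot(mapping = aes(x = `local.Date.time`, y = `Station Name`,
                       colour = `Transmitter Name`)) + # aes() maps colour to a variable...
  geom_point() +
  scale_colour_manual(values = c("firebrick", "steelblue", "darkgreen")) # ...the scale says WHICH colours
```

Here aes() declares that colour varies with Transmitter Name, while scale_colour_manual() specifies the actual colours used for the three sharks.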
Geometric objects are the actual marks we put on a plot. Examples include:
You can get a list of available geometric objects using the code below:
help.search("geom_", package = "ggplot2")
R offers a variety of functions for importing, manipulating, analysing and exporting spatial data. Although one might at first consider this to be the exclusive domain of GIS software, using R can frequently provide a much more lightweight, yet equally effective solution that embeds within a larger analytic workflow.
One of the tricky aspects of pulling spatial data into your analytic workflow is that there are numerous complicated data formats. In fact, even within R itself, functions from different user-contributed packages often require the data to be structured in very different ways. The good news is that efforts are underway to standardize spatial data classes in R. This movement is facilitated by sp, an important base package for spatial operations in R. It provides definitions for basic spatial classes (points, lines, polygons, pixels, and grids) in an attempt to unify the way R packages represent and manage these sorts of data. It also includes some core functions for creating and manipulating these data structures. The hope is that all spatial R packages will use (or at least provide conversions to) the ‘Spatial’ data class and its derivatives, as now defined in the sp package. All else being equal, we favor R functions and packages that conform to the sp standard, as these are likely to provide the greatest future utility and durability.
Here is a very useful style guide for coding using Spatial objects.
SpatialPoints and receiver locations

The most basic spatial data object is a point, which may have 2 (X, Y) or 3 (X, Y, Z) components. A single coordinate, or a collection of coordinates, may be used to define a SpatialPoints object.
In this exercise we are going to convert our receiver locations from a standard data frame object into a SpatialPointsDataFrame object.
# First load our VR2-W installation dataset for the Cleveland Bay study
statinfo <- read.csv('Data/Station_information.csv')
cb <- statinfo[statinfo$Installation%in%"Cleveland Bay",]
# Now convert the data.frame object into a SpatialPoints object
coordinates(cb) <- c("Longitude", "Latitude")
# Have a look at the created object
class(cb)

## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
str(cb)

## Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
## ..@ data :'data.frame': 63 obs. of 5 variables:
## .. ..$ Installation : Factor w/ 2 levels "Cleveland Bay",..: 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ Station.Name : Factor w/ 138 levels "C1","C2","C3",..: 26 25 24 3 4 2 35 8 1 5 ...
## .. ..$ Receiver : Factor w/ 138 levels "VR2W-101335",..: 1 2 5 11 12 13 14 15 17 18 ...
## .. ..$ DeploymentDate: Factor w/ 32 levels "2008-11-01","2009-04-15",..: 16 16 16 4 4 4 1 4 4 4 ...
## .. ..$ RecoveryDate : Factor w/ 40 levels "2010-11-03","2011-05-04",..: 5 4 3 7 32 21 32 32 32 32 ...
## ..@ coords.nrs : int [1:2] 6 7
## ..@ coords : num [1:63, 1:2] 147 147 147 147 147 ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:63] "76" "77" "78" "79" ...
## .. .. ..$ : chr [1:2] "Longitude" "Latitude"
## ..@ bbox : num [1:2, 1:2] 146.8 -19.3 147 -19.1
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:2] "Longitude" "Latitude"
## .. .. ..$ : chr [1:2] "min" "max"
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
## .. .. ..@ projargs: chr NA
Notice the class has now become a SpatialPointsDataFrame. The str() output contains lots of @ symbols, each denoting a different slot in this S4 R object. Typing cb@data will extract the attribute data (similar to the attribute table in ArcGIS). The X and Y locational information can now be found in the @coords slot. In addition, @bbox contains the bounding box coordinates and @proj4string contains the projection, or coordinate reference system (CRS), information.
cb@coords
cb@proj4string
cb@bbox
head(cb@data)
# Alternatively, use the slot() command to extract the
# different slots of data, e.g. the attribute data stored in the data slot
slot(cb,'data')

Now let’s draw a simple spatial plot of the receiver locations
# Now create a plot for the receiver stations
data.frame(cb) %>%
ggplot(mapping = aes(x = Longitude, y = Latitude)) +
xlab("Longitude") + ylab("Latitude") +
geom_point()

leaflet

Now that we have allocated our data as a Spatial* object, we can also use the incredibly powerful leaflet package in R to plot the locations of our hydrophones on an interactive map that you can share with your collaborators. The leaflet package utilises the open-access Leaflet JavaScript library (a web programming language) to create web-based maps. We will cover a basic map with some points on it today. See RStudio’s leaflet page for help and more mapping features, like polygons.
To get started with leaflet, first make sure you have the leaflet and htmlwidgets packages installed, and then load them into this session:
library(leaflet)
library(htmlwidgets)

We start by specifying a base layer, then we can add features to it. We will string together our layers using the pipe (%>%) operator. See this page for the numerous base map options. Finally, we add some markers, located using the Longitude and Latitude variables in our receiver locations data frame (cb):
library(leaflet)
myleafletplot <- leaflet() %>%
# Base groups (you can add multiple basemaps)
addProviderTiles(providers$Esri.WorldImagery, group="Satellite") %>%
addProviderTiles(providers$OpenStreetMap, group="Map") %>%
# Add receiver location data
addCircles(lng=cb$Longitude, lat=cb$Latitude, fill=FALSE, color="gray", weight=8)
myleafletplot # Print the map

We can now also add the detection data from our tagged sharks to show their positions within the array.
# First let's convert our blacktip data into a SpatialPointsDataFrame object
blacktip <- read.csv('Data/Blacktip_ClevelandBay.csv', header = TRUE)
coordinates(blacktip)<-c("Longitude","Latitude")
# Subset detection data from "Ana" and remove duplicate positions
ana <-
blacktip %>%
subset(`Transmitter.Name` == "Ana") %>%
remove.duplicates() # remove duplicated positions to reduce the number of points to plot on our leaflet map

Now let’s add Ana’s detection data to our leaflet map
myleafletplot %>%
# add the tag detection data
addCircleMarkers(lng=ana$Longitude, lat=ana$Latitude, weight=2,radius=4,color = "red",
stroke = FALSE, fillOpacity = 1, group = "Ana") %>% # Don't forget to assign a group to the markers
# Layers control
addLayersControl(
baseGroups= c("Satellite","Map"),
overlayGroups = c("Ana"), # add the groups you want to overlay on the base maps
options = layersControlOptions(collapsed = FALSE))

# Now try adding detection data from "Colin" and "Bruce" to the map

Finally, you can save this leaflet map as an HTML file and share it with your friends or colleagues
library(htmlwidgets)
saveWidget(myleafletplot, file="mymap.html") # uses the htmlwidgets package
In this third session we are going to work with a data set containing detection data from 3 Estuarine Crocodiles (Crocodylus porosus) as shown in the image above. These animals were captured and tagged within the Wenlock River, Cape York. Crocodiles were dual tagged using both GPS transmitters and a network of acoustic hydrophones deployed along the river system. You can read more about this study here.
Vinay to add…
Google Earth offers a simple yet powerful way of visualising your acoustic tracking data through time. However, pulling detection datasets into Google Earth can be challenging given the size of many detection files. The VTrack R package has a few handy functions for visualising your tag detections as track animations in Google Earth.
For this to work, your receiver locations MUST be in the datum WGS84 and you will need to have Google Earth installed on your machine. If you have not already got it, Google Earth can be downloaded for free here: https://www.google.com/earth/download/gep/agree.html
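If your coordinates are already WGS84 longitude/latitude but the Spatial* object's CRS slot is empty (as with the cb object earlier, where projargs was NA), you can declare the datum explicitly with sp. A minimal sketch:

```
library(sp)
# DECLARE (not transform!) the coordinate reference system as WGS84 long/lat
proj4string(cb) <- CRS("+proj=longlat +datum=WGS84")
cb@proj4string # check the CRS slot is now populated
```

Note that proj4string<- only labels the existing coordinates; it does not reproject them, so only use it when you know the coordinates really are WGS84.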
For this workshop, we will need the most recent version of the VTrack R package from our GitHub repository. This contains some new functionality that currently is not on CRAN.
library(devtools) # Provides install_github() for installing packages from GitHub
install_github("RossDwyer/VTrack")
library(VTrack)

We will then load our tracking detection dataset into the VTrack format and extract a list of transmitters.
# Load crocodile datset into VTrack archive
data(crocs)
Vcrocs <- ReadInputData(infile=crocs,
iHoursToAdd=10,
dateformat = NULL,
sVemcoFormat='1.0')
Vcrocs %>% head()

## DATETIME TRANSMITTERID SENSOR1 UNITS1 RECEIVERID STATIONNAME
## 1 2008-09-01 10:00:48 139 0.20 m 103551 Unknown
## 2 2008-09-01 12:49:23 139 0.00 m 103551 Unknown
## 3 2008-09-01 13:02:51 139 0.20 m 103551 Unknown
## 4 2008-09-01 13:14:18 139 0.20 m 103551 Unknown
## 5 2008-09-01 13:27:18 139 0.20 m 103551 Unknown
## 6 2008-09-01 13:35:18 139 0.08 m 103551 Unknown
# Load Wenlock points file
data(PointsDirect_crocs)
PointsDirect_crocs %>% head()

## LOCATION LATITUDE LONGITUDE RADIUS
## 1 103561 -12.25719 141.9228 100
## 2 103549 -12.25867 141.9304 100
## 3 103562 -12.25553 141.9419 100
## 4 103547 -12.25526 141.9436 100
## 5 103563 -12.27036 141.9725 100
## 6 103548 -12.27114 142.0695 100
Once we have our dataset in the VTrack archive format and a separate Points file containing the receiver locations, we can then run VTrack’s KML creator functions. GenerateAnimationKMLFile_Track() generates a moving arrow for a single transmitter as it moves between the detection fields of adjacent receiver stations.
levels(Vcrocs$TRANSMITTERID)
# NEED TO CHECK THIS WORKS
# Run the function to generate the KML for a single transmitter
GenerateAnimationKMLFile_Track(Vcrocs, # VTrack archive file
"139", # Transmitter code
PointsDirect_crocs, # points file
"Track1.kml", # file name
"cc69deb3", # colour of the track
sLocation="RECEIVERID")

The file will be written to your Project folder and can be opened in Google Earth. Users can adjust the time slider to visualise individual time periods, or click the spanner icon to slow down the speed of the animation.
There is also another handy function for visualising the movements of multiple animals at a time.
GenerateAnimationKMLFile_Multitag(Vcrocs,
PointsDirect_crocs,
"Croc Multi.kml",
sLocation="RECEIVERID")

In this next exercise we will extract a metric to compare how far each of our tagged animals moved during the study period. To do this, we use one of the Generate*Distance() functions within VTrack, which calculates the distances between each of our hydrophone stations and assembles them into a matrix.
If our receivers are in open water or in an enclosed bay or dam, there may be no obvious barriers to animal movement so we could use the GenerateDirectDistance() function to generate our distance matrix.
VR2ArrayDM <- GenerateDirectDistance(PointsDirect_crocs)
VR2ArrayDM

## DM 103561 103549 103562 103547 103563 103548
## 1 103561 0.0000000 0.6471471 1.881276 2.074469 5.394396 15.819379
## 2 103549 0.6471471 0.0000000 1.090251 1.282652 4.549648 14.976537
## 3 103562 1.8812763 1.0902511 0.000000 0.000000 3.512702 13.779413
## 4 103547 2.0744690 1.2826518 0.000000 0.000000 3.356438 13.593494
## 5 103563 5.3943955 4.5496478 3.512702 3.356438 0.000000 10.344637
## 6 103548 15.8193785 14.9765370 13.779413 13.593494 10.344637 0.000000
## 7 103564 19.3263822 18.4792507 17.342354 17.163531 13.747736 3.760611
## 8 103551 20.9430834 20.1043298 19.063703 18.897261 15.366617 6.512850
## 9 103557 21.5580310 20.7371723 19.789242 19.635306 16.078979 8.376548
## 10 103556 22.7669620 21.9521408 21.028185 20.877384 17.323324 9.705648
## 11 103566 24.5782507 23.7808895 22.920078 22.777795 19.242669 12.099660
## 12 103550 26.5498008 25.7662512 24.949360 24.812961 21.299637 14.387823
## 13 103565 28.9378394 28.1573622 27.347990 27.212376 23.701708 16.615428
## 14 103552 30.9965651 30.1835023 29.260079 29.108549 25.553530 17.156852
## 15 103555 32.3819385 31.5579707 30.588612 30.430790 26.875910 17.979824
## 16 103554 32.7809210 31.9530064 30.965332 30.805043 27.253680 18.176971
## 17 103560 33.9036784 33.0773337 32.096558 31.937131 28.384290 19.335772
## 18 103559 34.9619912 34.1310983 33.128016 32.965581 29.419037 20.150172
## 19 103553 37.2913375 36.4564511 35.431472 35.266074 31.729162 22.238218
## 20 103558 37.7319469 36.8931403 35.844499 35.676017 32.153396 22.474688
## 103564 103551 103557 103556 103566 103550 103565
## 1 19.326382 20.943083 21.558031 22.766962 24.578251 26.549801 28.937839
## 2 18.479251 20.104330 20.737172 21.952141 23.780889 25.766251 28.157362
## 3 17.342354 19.063703 19.789242 21.028185 22.920078 24.949360 27.347990
## 4 17.163531 18.897261 19.635306 20.877384 22.777795 24.812961 27.212376
## 5 13.747736 15.366617 16.078979 17.323324 19.242669 21.299637 23.701708
## 6 3.760611 6.512850 8.376548 9.705648 12.099660 14.387823 16.615428
## 7 0.000000 2.931700 5.196961 6.388394 8.781520 11.020657 13.098692
## 8 2.931700 0.000000 2.172210 3.268928 5.651550 7.889866 9.996114
## 9 5.196961 2.172210 0.000000 1.152558 3.523165 5.811686 8.074579
## 10 6.388394 3.268928 1.152558 0.000000 2.210228 4.488931 6.726023
## 11 8.781520 5.651550 3.523165 2.210228 0.000000 2.088907 4.402984
## 12 11.020657 7.889866 5.811686 4.488931 2.088907 0.000000 2.203678
## 13 13.098692 9.996114 8.074579 6.726023 4.402984 2.203678 0.000000
## 14 13.287779 10.500000 9.284697 8.032929 6.417234 5.151216 3.776521
## 15 14.035756 11.498302 10.629864 9.478970 8.177470 7.164322 5.866624
## 16 14.220242 11.792460 11.060138 9.955027 8.780395 7.875045 6.644070
## 17 15.378827 12.942941 12.165327 11.034062 9.749719 8.673083 7.202649
## 18 16.190297 13.901206 13.278172 12.195775 11.028359 10.029008 8.579156
## 19 18.293816 16.170599 15.684223 14.639039 13.529542 12.518597 10.979743
## 20 18.557189 16.588880 16.258082 15.271921 14.306338 13.425388 11.995031
## 103552 103555 103554 103560 103559 103553
## 1 30.996565 32.3819385 32.7809210 33.9036784 34.961991 37.291337
## 2 30.183502 31.5579707 31.9530064 33.0773337 34.131098 36.456451
## 3 29.260079 30.5886116 30.9653319 32.0965579 33.128016 35.431472
## 4 29.108549 30.4307902 30.8050429 31.9371306 32.965581 35.266074
## 5 25.553530 26.8759101 27.2536799 28.3842901 29.419037 31.729162
## 6 17.156852 17.9798237 18.1769705 19.3357717 20.150172 22.238218
## 7 13.287779 14.0357565 14.2202423 15.3788268 16.190297 18.293816
## 8 10.500000 11.4983025 11.7924596 12.9429407 13.901206 16.170599
## 9 9.284697 10.6298640 11.0601381 12.1653266 13.278172 15.684223
## 10 8.032929 9.4789702 9.9550274 11.0340619 12.195775 14.639039
## 11 6.417234 8.1774696 8.7803950 9.7497192 11.028359 13.529542
## 12 5.151216 7.1643220 7.8750455 8.6730825 10.029008 12.518597
## 13 3.776521 5.8666244 6.6440699 7.2026486 8.579156 10.979743
## 14 0.000000 1.8965910 2.6686379 3.3314273 4.700188 7.174198
## 15 1.896591 0.0000000 0.5864414 1.3743882 2.676557 5.180581
## 16 2.668638 0.5864414 0.0000000 0.9589006 2.055783 4.549720
## 17 3.331427 1.3743882 0.9589006 0.0000000 1.177639 3.645523
## 18 4.700188 2.6765574 2.0557827 1.1776389 0.000000 2.305652
## 19 7.174198 5.1805815 4.5497200 3.6455228 2.305652 0.000000
## 20 8.116282 6.0612341 5.3654575 4.5982252 3.220943 0.998162
## 103558
## 1 37.731947
## 2 36.893140
## 3 35.844499
## 4 35.676017
## 5 32.153396
## 6 22.474688
## 7 18.557189
## 8 16.588880
## 9 16.258082
## 10 15.271921
## 11 14.306338
## 12 13.425388
## 13 11.995031
## 14 8.116282
## 15 6.061234
## 16 5.365458
## 17 4.598225
## 18 3.220943
## 19 0.998162
## 20 0.000000
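The returned matrix is an ordinary data frame, so individual station-to-station distances can be pulled out by indexing. A minimal sketch, assuming the VR2ArrayDM object created above (its first column, DM, holds the station serial numbers):

```r
# Use the DM column as row names so we can index by station serial number
rownames(VR2ArrayDM) <- VR2ArrayDM$DM
# Direct (as-the-crow-flies) distance between stations 103561 and 103548
VR2ArrayDM["103561", "103548"]   # ~15.82 km in the matrix printed above
```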
In our crocodile example, however, our animals' movements are bounded by the banks of the river, so they cannot move directly (as the crow flies) between hydrophone stations. As a workaround, the VTrack package has a function, GenerateCircuitousDistance(), that calculates distances in series through a set of locations and waypoints to form an indirect path between receiver stations.
# Load the points file
data(PointsCircuitous_crocs)
# Generate the Circuitous Distance Matrix
CircuitousDM <- GenerateCircuitousDistance(PointsCircuitous_crocs)
# Now plot example of how circuitous distances between receivers were generated
# In this example an individual must follow the course of the river in order to
# move between receivers
PointsCircuitous_crocs %>%
ggplot(aes(x=LONGITUDE, y=LATITUDE)) +
xlab("LONGITUDE") +
ylab("LATITUDE") +
geom_path() +
geom_point() +
geom_point(data=PointsDirect_crocs,
aes(x=LONGITUDE, y=LATITUDE), color="red", size=3, shape=16) + # Plots the hydrophones in a different colour
geom_point(data=PointsDirect_crocs[1,],
aes(x=LONGITUDE, y=LATITUDE), color="green", size=3, shape=16) # Plots 103561 (i.e. the most downstream hydrophone) in a different colour

We would then use this within the RunResidenceExtraction() function in VTrack to extract the movements between hydrophone stations and link these to our measure of distance travelled.
Res <- RunResidenceExtraction(Vcrocs,        # VTrack archive file
"STATIONNAME",                               # Location field used to define residences
2,                                           # Residence threshold (min. successive detections)
60*60*12,                                    # Time-out threshold in seconds (here 12 hours)
sDistanceMatrix=CircuitousDM)
# Our movements file
CrocMovements <- Res$nonresidences

Finally, many acoustic array designs will not fall in series like our example above and will instead be made up of multiple creeks and branches, like this.
In this image the green area is the river and red crosses are the hydrophone locations. This function works by using a rasterised version of our study region (1 = water, 0 = land) then calculating the shortest (i.e. least cost) route transitioning from one cell to another to connect adjacent receiver stations. The function makes use of the gdistance R package and will work using any arrangement of acoustic arrays provided they are connected by water.
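The idea can be demonstrated on a tiny synthetic raster before applying it to the real study area. This is just an illustrative sketch: the 5 x 5 grid, coordinates and object names here are made up and are not part of the crocodile dataset.

```r
library(raster)
library(gdistance)
# A 5 x 5 toy grid: 1 = water (passable), 0 = land
m <- matrix(1, nrow = 5, ncol = 5)
m[1:4, 3] <- 0                       # a strip of land down the middle
toy <- raster(m, xmn = 0, xmx = 5, ymn = 0, ymx = 5)
# Build the transition object exactly as we do for the real raster below,
# then geo-correct it so diagonal steps cost proportionally more
trToy <- geoCorrection(transition(toy, transitionFunction = mean, directions = 8))
# Least-cost distance (in map units) between two points on either side of the strip
costDistance(trToy, rbind(c(0.5, 4.5)), rbind(c(4.5, 4.5)))
```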
First we load the raster layer of our study area using the raster R package and convert this to a transition object.
library(raster)
library(gdistance)
r5 <- raster("GIS/wenlock raster UTM.tif") # Load the raster
tr <- transition(r5, transitionFunction=mean, directions=8) # Create a Transition object from the raster

Next we load the tracking dataset and points file, which uses a multi-branch array.
library(VTrack)
barra <- read.csv('Data/barra.csv', header = TRUE)
barra[,1] <- as.POSIXct(barra[,1])
PointsLeastCost_crocs <- read.csv("Data/PointsLeastCost_crocs.csv")
# convert the raster to points for plotting
map.p <- rasterToPoints(r5)
df <- data.frame(map.p) # Make the points a dataframe for ggplot
colnames(df) <- c("X", "Y", "Water") # Make appropriate column headings
# Now plot the map
ggplot(data=df, aes(y=Y, x=X)) +
geom_raster(aes(fill=Water)) +
geom_point(data=PointsLeastCost_crocs,
aes(x=X, y=Y), color="white", size=3, shape=16) +
theme_bw() +
coord_equal()

Finally we run the GenerateLeastCostDistance() function in VTrack to generate the distance matrix containing the least-cost paths.
LeastCostDM <- GenerateLeastCostDistance(sPointsFile = PointsLeastCost_crocs,
sTransition = tr)

Generating distance matrices can also be a useful way of extracting the distance of each receiver station from a specific location (e.g. a receiver positioned at the mouth of an estuary, an instream structure or another point of interest). These distances can then be merged with the original detection dataset and plotted through time to look at associations and commonality in the timing of animal movements.
DistDownstream <- CircuitousDM[,1:2] # i.e. distance from the most downstream hydrophone - serial number 103561
names(DistDownstream) <- c("RECEIVERID","Distance.DS")
Vcrocsm <- merge(Vcrocs,DistDownstream,by="RECEIVERID")
Vcrocsm <- Vcrocsm[order(Vcrocsm$TRANSMITTERID,Vcrocsm$DATETIME),]
row.names(Vcrocsm) <- NULL
Vcrocsm <- Vcrocsm[,c("DATETIME","TRANSMITTERID","SENSOR1","UNITS1","RECEIVERID","STATIONNAME","Distance.DS")]
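With the downstream distances attached to each detection, a simple per-animal movement metric falls out directly before we plot. A sketch using base R (RiverRange is a made-up name; assumes the Vcrocsm object created above):

```r
# Linear extent of river (km) over which each transmitter was detected
RiverRange <- tapply(Vcrocsm$Distance.DS, Vcrocsm$TRANSMITTERID,
                     function(d) max(d) - min(d))
sort(RiverRange, decreasing = TRUE)   # widest-ranging individuals first
```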
Vcrocsm %>%
ggplot(aes(x=DATETIME, y=Distance.DS,group=TRANSMITTERID,colour=TRANSMITTERID)) +
geom_line(alpha=0.5) +
scale_y_continuous(name="Distance from most downstream receiver (km)")+
xlab("Date") +
theme_bw()+
theme(legend.title = element_blank(),
legend.position='none',
axis.line = element_line(colour = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
panel.background = element_blank())

Central to working with spatial data is that these data have a coordinate reference system (CRS class) associated with them. Geographic CRS are expressed in degrees and are associated with an ellipse, a prime meridian and a datum. Projected CRS are expressed in a measure of length and a chosen position on the earth, as well as the underlying ellipse, prime meridian and datum.
Most countries have multiple coordinate reference systems, and where they meet there is usually a big mess; this led the European Petroleum Survey Group (EPSG) to compile a geodetic parameter dataset.
The EPSG list, among other sources, is used in the workhorse PROJ.4 library, which handles transformation of spatial positions between different CRS. This library is interfaced with R in the rgdal package, and the CRS class is defined partly in the sp package and partly in rgdal.
In the next step, we need to define the CRS which corresponds to our dataset. For the cassowary GPS dataset (cassdat.sp), we have not yet specified the CRS, so the @proj4string slot is currently empty (= NA). We therefore need to refer to the correct proj4 string information, which is contained within the rgdal package.
Our receiver coordinates were collected in the WGS 84 geographic datum in Decimal Degrees.
For simplicity, each projection can be referred to by a unique ID from the European Petroleum Survey Group (EPSG) geodetic parameter dataset. You can find the relevant EPSG code for your coordinate system from the website http://spatialreference.org. Here simply enter in a key word in the search box and select from the list the correct coordinate system. There is a map image in the top right of the site to help you.
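You can also search the EPSG table without leaving R, using rgdal's make_EPSG() function. A sketch (the exact descriptive strings returned depend on the PROJ.4 version installed on your machine):

```r
library(rgdal)
EPSG <- make_EPSG()                   # data frame of EPSG codes known to PROJ.4
# Search the descriptive notes for our coordinate system
EPSG[grep("WGS 84$", EPSG$note), c("code", "note")]
```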
The equivalent EPSG code for WGS 84 is 4326. To set the spatial projection we use the proj4string() function:
WGS <- CRS("+init=epsg:4326")
proj4string(cassdat.sp) <- WGS

In order to calculate distances and areas correctly, we now need to transform our data to the correct spatial projection.
From the above graphic, can you see which is the correct projection for the Townsville area? Here's a hint:
GDA <- CRS("+init=epsg:28355") # EPSG code 28355 corresponds to GDA94 / MGA zone 55.
cb.P <- spTransform(cb,GDA)

Once you have the Spatial object the way you like it, you will want to export it to view in a GIS. Here we show you two options for exporting your Spatial object: as a shapefile, or as a .kml for viewing in Google Earth. As with reading in spatial objects, there are a number of R packages out there to help you. Simply type ??kml or ??shapefile to look up a few of these options.
In this example we are going to use the writeOGR() function in the rgdal package to write the data containing the movement metrics to a shapefile.
The ld() function allows us to quickly convert an object of class ltraj to a data.frame. We then reproject our data back to longitudes and latitudes (WGS 84) for viewing in ArcGIS or Google Earth.
# Convert the ltraj object into a standard data frame
cassdat.lt.proj <- ld(cassdat.lt) # Change to a dataframe object
cassdat.lt.proj$date <- as.POSIXct(cassdat.lt.proj$date) # Reassigns date as a date object
# Convert the data frame object into a SpatialPointsDataFrame
coordinates(cassdat.lt.proj) <- ~x+y # Extract the coordinates
proj4string(cassdat.lt.proj) <- GDA # Assigns the coordinate reference system
# Change it back to WGS 1984 projection
cassdat.lt.WGS <- spTransform(cassdat.lt.proj,WGS)

Finally we run the writeOGR() function to write this file to disk. We specify the folder as being within the GIS folder.
writeOGR(cassdat.lt.WGS, # This field needs to be a SpatialPoints*, SpatialLines* or SpatialPolygons* object
dsn="GIS",
layer= "cass_points", driver="ESRI Shapefile",
dataset_options=c("NameField=id"))
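The second export option mentioned above, a .kml for Google Earth, can be produced with the same writeOGR() call by swapping the driver. A sketch, assuming the reprojected cassdat.lt.WGS object from above (KML files expect WGS 84 coordinates, which is why we reprojected first):

```r
# Write the same points to a KML file for viewing in Google Earth
writeOGR(cassdat.lt.WGS,
         dsn = "GIS/cass_points.kml",  # output file
         layer = "cass_points",
         driver = "KML",
         overwrite_layer = TRUE)
```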